Abstract

The notion of a profile appeared in the 1970s, mainly because of the need to create custom
applications that could be adapted to the user. In what follows, we treat the different aspects of
the user's profile: its definition, its features and its indicators of interest. We then describe
the different approaches to modelling and acquiring the user's interests.

1 Introduction

Recommender Systems (RS) belong to a more general framework of systems called Personalized
Access Systems. These systems integrate the user, as an information structure, into the process of
selecting the information relevant to him Shani et al. [2007].

In an RS, the user's profile is a kind of query about his interests, describing features that can be
shared with a group of individuals. By comparing these features with incoming documents, the system
can select those likely to be of interest Baeza-Yates and Ribeiro-Neto [1999].

1.1 User's Profile Features

The features that characterize the user's profile are his acquired knowledge in different subjects
(background), his objectives (goals) and his interests Brusilovsky [2001].

Background: The background concerns all the information related to the user's past experience. This
includes his profession, his experience in fields related to his work, and how familiar he is with
the working environment of the system.

Objectives: The objectives are the user's needs when he searches for information. 

Interests: The interests are the documents that the user has consulted or reported to the system,
either directly or indirectly through feedback captured by indicators of interest.

1.2 Indicators of Interests



Interests are detected implicitly through observable behaviours that the system collects while
the user interacts with his environment. In this context, several behaviours can be considered, such
as:

- Click of the mouse on a document; 

- Scrolling a document using the mouse or keyboard; 

- Time spent on a document; 

- Printing a document; 

- Saving a document; 

- Copying/pasting all or part of a document; 

- Document annotation; 
- Qualitative evaluation of a document;

- Navigating to reach a document; 

- Eye-tracking. 

These behaviours during user/system interactions are indicators of the relevance of a document and 
provide useful evidence to predict interests indirectly expressed by the user. 
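
As a minimal sketch of how such indicators might be combined, the following Python snippet turns a log of observed behaviours into one interest score per document. The indicator names, the weights, the time cap and the function `interest_scores` are illustrative assumptions, not values taken from the works cited here.

```python
from collections import defaultdict

# Hypothetical weights: stronger evidence of interest gets a larger weight.
INDICATOR_WEIGHTS = {
    "click": 1.0,
    "scroll": 0.5,
    "print": 3.0,
    "save": 3.0,
    "copy_paste": 2.0,
    "annotate": 2.5,
}
SECONDS_PER_POINT = 30.0  # assumed: 30 s of reading counts as one point
MAX_TIME_POINTS = 4.0     # assumed cap so that idle windows do not dominate


def interest_scores(events):
    """Aggregate (doc_id, indicator, value) events into a score per document."""
    scores = defaultdict(float)
    for doc_id, indicator, value in events:
        if indicator == "time_spent":
            scores[doc_id] += min(value / SECONDS_PER_POINT, MAX_TIME_POINTS)
        else:
            # Indicators without a weight contribute nothing.
            scores[doc_id] += INDICATOR_WEIGHTS.get(indicator, 0.0) * value
    return dict(scores)


events = [
    ("doc1", "click", 1),
    ("doc1", "time_spent", 90),
    ("doc1", "save", 1),
    ("doc2", "click", 1),
    ("doc2", "time_spent", 5),
]
scores = interest_scores(events)
```

In this example `doc1`, which was read for longer and saved, receives a higher score than `doc2`, which was only clicked and abandoned after a few seconds.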

The structures that store the user's profile features are modelled as described in what follows.

2 Modelling the User's Profile



User modelling is a research field that concerns the improvement of man-machine interaction by
predicting the interests of users Shani et al. [2007].

Modelling the user's profile consists of designing a structure for storing all the information which
characterizes the user and describes his interests, his background and his objectives Salton et al.
[1975].

There are several ways to model or represent the user's profile. We describe here the different 
existing techniques. 

2.1 Vector Representation 

Vector representation is based on the classic vector space model of Salton et al. [1975], in which
the profile is represented as an m-dimensional vector: each dimension corresponds to a distinct
term, and m is the total number of terms that exist in the user's profile.

The vector representation was the first model of the user's profile to be exploited. The weighting
of terms is usually based on a TF-IDF scheme, commonly used in IR Salton et al. [1975].
The weight associated with each term represents its degree of importance in the user's profile.
Several RS use such a representation, for example Mukhopadhyay et al. [2008] for an online newspaper
or Chan et al. [2011] for the recommendation of web services.

In addition to being simple to implement, the use of several vectors to represent the profile
makes it possible to take into account the different interests and their evolution through time.
The drawback of this representation is its lack of structure and semantics (no connexion between
terms).
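
The following Python sketch illustrates the vector representation under simple assumptions: documents are already tokenised, term weights are a basic TF-IDF variant (relative term frequency times log inverse document frequency), the profile is the sum of the vectors of consulted documents, and candidates are compared with the profile by cosine similarity. The toy corpus and the helper names are hypothetical, and real systems typically use more refined weighting.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one {term: weight} dict per tokenised document."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def build_profile(vectors):
    """Sum the vectors of the consulted documents into one profile vector."""
    profile = Counter()
    for vec in vectors:
        profile.update(vec)  # adds the weights term by term
    return dict(profile)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = [
    ["football", "match", "goal"],       # consulted by the user
    ["football", "league", "transfer"],  # consulted by the user
    ["football", "goal", "league"],      # candidate about sport
    ["election", "vote", "tax"],         # candidate about politics
]
vectors = tf_idf_vectors(docs)
profile = build_profile(vectors[:2])
sim_sports = cosine(profile, vectors[2])
sim_politics = cosine(profile, vectors[3])
```

The sport candidate shares terms with the profile and obtains a positive similarity, whereas the politics candidate shares none and scores zero. The sketch also makes the drawback visible: the profile is a flat bag of terms, so related terms such as "goal" and "match" are not linked in any way.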

2.2 Connexion Representation

The connexion representation is based on an associative interconnection of the nodes representing
the user's profile.

In this context, the system proposed by Wibowo et al. [2011] uses the concepts existing in WordNet
to group similar terms. The user's profile is then represented as a semantic network in which each
group of concepts is represented by nodes and arcs.

A similar approach has been used by Mezghani et al. [2012]. Initially, each semantic network
contains a collection of nodes in which each node represents a concept. Each node contains a single
vector of weighted terms. When new information about the user is collected, the profile is enriched
by integrating weighted terms into the corresponding concepts.

This representation has the advantage of structuring the profile Wibowo et al. [2011], but its
weakness is the absence of hierarchical relations among the concepts of the network, which would
reflect the semantic generalization/specialization between concepts (e.g., Biology is one
specialization of Science).
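
A minimal sketch of such a network is given below. Each concept node holds one vector of weighted terms, and arcs associate concepts that occur together. The class name, the small `TERM_TO_CONCEPT` lookup (a hand-written stand-in for a lexical resource such as WordNet) and the co-occurrence rule for creating arcs are assumptions made for illustration; they are not the design of the cited systems.

```python
from collections import defaultdict

# Hypothetical term-to-concept lookup standing in for a lexical resource;
# a real system would query the resource instead of a fixed table.
TERM_TO_CONCEPT = {
    "goal": "football", "striker": "football",
    "serve": "tennis", "racket": "tennis",
}


class SemanticNetworkProfile:
    def __init__(self):
        # concept -> {term: weight}
        self.nodes = defaultdict(lambda: defaultdict(float))
        # frozenset({concept_a, concept_b}) -> association strength
        self.arcs = defaultdict(float)

    def enrich(self, weighted_terms):
        """Integrate the weighted terms of one consulted document."""
        touched = set()
        for term, weight in weighted_terms.items():
            concept = TERM_TO_CONCEPT.get(term)
            if concept is None:
                continue  # term not covered by the lookup
            self.nodes[concept][term] += weight
            touched.add(concept)
        # Assumed rule: concepts seen in the same document become associated.
        for a in touched:
            for b in touched:
                if a < b:
                    self.arcs[frozenset((a, b))] += 1.0

    def neighbours(self, concept):
        """Concepts linked to the given concept by at least one arc."""
        return {c for pair in self.arcs if concept in pair
                for c in pair if c != concept}


profile = SemanticNetworkProfile()
profile.enrich({"goal": 0.8, "striker": 0.5, "unknown": 0.3})
profile.enrich({"goal": 0.2, "serve": 0.6})
```

The arcs here are flat associations: nothing states that one concept generalizes another, which is exactly the limitation noted above.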

2.3 Ontologies Representation

Ontologies may be used to represent the semantic relations among the informational units that make
up the user's profile Middleton et al. [2004]. This representation overcomes the limitations of
the connexion representation by presenting the user's profile in the form of a hierarchy of concepts.
Each class in the hierarchy represents the knowledge of an area of the user's interests. The
relationship (generalization/specialization) between the elements of the hierarchy reflects the
user's interests more realistically.
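
The hierarchy can be sketched as follows. The concept tree is hypothetical (only the Science/Biology pair comes from the example given earlier), and the rule of passing a decayed share of each interest to the more general concepts is one possible convention assumed here, not a method prescribed by the cited work.

```python
# Hypothetical concept hierarchy: child -> parent (None marks the root).
PARENT = {
    "Science": None,
    "Biology": "Science",
    "Physics": "Science",
    "Genetics": "Biology",
}


def ancestors(concept):
    """Yield the generalizations of a concept, nearest first."""
    parent = PARENT[concept]
    while parent is not None:
        yield parent
        parent = PARENT[parent]


def add_interest(profile, concept, weight, decay=0.5):
    """Record interest in a concept and a decayed share in each generalization."""
    profile[concept] = profile.get(concept, 0.0) + weight
    for ancestor in ancestors(concept):
        weight *= decay  # assumed: each level up keeps half of the weight
        profile[ancestor] = profile.get(ancestor, 0.0) + weight
    return profile


profile = add_interest({}, "Genetics", 1.0)
profile = add_interest(profile, "Physics", 0.4)
```

After these two updates the profile records a strong interest in Genetics, a weaker one in Biology, and an interest in Science that accumulates contributions from both branches.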

The representation of the profile based on ontologies creates some problems related to the
heterogeneity and diversity of the user's interests. For instance, users may have different
perceptions of the same concept, which leads to inaccurate representations Godoy and Amandi [2005].

2.4 Multidimensional Representation

The user's profile can contain several types of information such as demographics, interests, purpose,
history and other information Niederee et al. [2004].

In Lakiotaki et al. [2011], the authors represent the contents of the user's profile by a structured
model with predefined categories called dimensions: personal data, data source, data delivery,
behavioural data and security data. This model was proposed in the development of a digital library
service.

In the same context, Kostadinov et al. [2007] propose a set of open dimensions that can
contain most of the information that characterizes the user. The authors propose eight dimensions:
personal data, the focus, the domain ontology, the expected quality, customization, security,
preferences and miscellaneous information.

The multidimensional representation has the advantage of providing a better interpretation of the 
semantics of the user's profile. 
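
A structured profile of this kind maps naturally onto a record type. The sketch below uses the five dimensions listed above as fields of a Python dataclass; the class name, the use of plain dictionaries for each dimension and the example contents are assumptions for illustration only.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class UserProfile:
    """One field per dimension; the contents of each field are illustrative."""
    personal_data: dict = field(default_factory=dict)
    data_source: dict = field(default_factory=dict)
    data_delivery: dict = field(default_factory=dict)
    behavioural_data: dict = field(default_factory=dict)
    security_data: dict = field(default_factory=dict)

    def filled_dimensions(self):
        """Names of the dimensions that currently hold information."""
        return [name for name, value in asdict(self).items() if value]


profile = UserProfile(
    personal_data={"language": "en"},
    behavioural_data={"saved_documents": ["doc1"]},
)
```

Keeping each dimension in its own field lets a system read or update, for example, behavioural data without touching personal or security data.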

3 Acquiring the User's Profile



In the recommendation process, the crucial step is the construction of a user's profile that truly
reflects his interests. However, this step is not an easy task: on the one hand, the user may not be
sure of his interests Bila et al. [2008]; on the other hand, he often does not want to, or even
cannot, make the effort required for its creation.
In this context, several approaches have been proposed for the acquisition of the user's profile. 

3.1 Simplistic Approach

This approach promotes the description of the user's profile through a set of keywords explicitly
provided by the user Baeza-Yates and Ribeiro-Neto [1999]. The method demands a lot from the
user: if he is not familiar with the system and the vocabulary of incoming documents, it
becomes difficult for him to provide the proper keywords that describe his interests.

3.2 Dynamic Approach

An alternative approach is to build the user's profile by dynamically collecting information about
his preferences. In the first step, the user must provide a set of keywords describing his
preferences to initialize the profile. Then, each time a new document arrives, the system uses the
user's profile to select the documents potentially fitting his interests, and displays them to the
user. The user indicates one document that is relevant to him and one that is irrelevant. This
information is used to adjust the description of the user's profile so that it reflects his new
preferences Baeza-Yates and Ribeiro-Neto [1999].
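
The text does not specify how the profile is adjusted. One well-known option is a Rocchio-style relevance feedback update, sketched below under that assumption: the profile vector moves towards the document marked relevant and away from the one marked irrelevant. The function name and the values of `alpha`, `beta` and `gamma` are illustrative choices.

```python
def update_profile(profile, relevant_doc, irrelevant_doc,
                   alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update on sparse {term: weight} vectors."""
    terms = set(profile) | set(relevant_doc) | set(irrelevant_doc)
    updated = {}
    for term in terms:
        weight = (alpha * profile.get(term, 0.0)
                  + beta * relevant_doc.get(term, 0.0)
                  - gamma * irrelevant_doc.get(term, 0.0))
        if weight > 0.0:  # terms pushed to zero or below are dropped
            updated[term] = weight
    return updated


# Profile initialised from the user's keywords, then adjusted by feedback.
profile = {"football": 1.0, "politics": 0.2}
relevant = {"football": 0.5, "tennis": 0.8}
irrelevant = {"politics": 1.0}
profile = update_profile(profile, relevant, irrelevant)
```

After one round of feedback the weight of "football" grows, "tennis" enters the profile, and "politics" disappears because the irrelevant document outweighs its small initial weight.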

3.3 Machine Learning Approach

Machine learning is an essential step in designing an automatic RS Lopez-Lopez et al. [2009]. As it
is not reasonable to ask the user for a set of keywords describing his preferences, the idea is to
observe the user's behaviour through his interactions with the system in order to learn his profile.
Most systems construct the user's profile by learning from consulted documents. This profile is
generally based on the vector model, and different indicators of interest are used, such as mouse
movements and clicks. The adjustment of term weights often relies on learning techniques
such as neural networks, genetic algorithms and others Anderson [2002].

4 Conclusion and Discussion

We now present a synthesis of the different approaches to representing and acquiring the user's
profile discussed above.

Table 1: Representing the user's profile

Representation technique        | Advantage                             | Disadvantage
--------------------------------+---------------------------------------+--------------------------------------
Vector representation           | Takes into account the diversity of   | The lack of structure of the data
                                | interests and their evolution through |
                                | time and is easy to implement         |
Connectionist representation    | Semantic relationships between        | The absence of hierarchical relations
                                | interests                             | between the network's concepts
Ontology representation         | Hierarchical relations between the    | Problems related to the heterogeneity
                                | interests of the user's profile       | and the diversity of the user's
                                |                                       | interests
Multidimensional representation | Better interpretation of the          | Ambiguity in interpreting the roles
                                | semantics of the user's profile       | of each dimension

Table 1 summarizes the advantages and disadvantages of the different approaches to modelling the
user's profile. From the table, we note that the connectionist representation solves the shortcomings
of the vector representation by establishing relationships between interests in the user's profile.
Moreover, the ontology representation overcomes the limits of the connexion representation by
including a hierarchy between interests. Finally, the multidimensional representation allows a better
interpretation of the semantics of the user's profile.

Table 2: Acquiring the user's profile

Acquisition approach      | Advantage                             | Disadvantage
--------------------------+---------------------------------------+----------------------------------------
Simplistic approach       | The system is certain of the user's   | Hard for the user to provide the
                          | interests                             | appropriate keywords, the profile is
                          |                                       | static and the system is user-dependent
Dynamic approach          | The system is certain of the user's   | Hard for the user to provide the
                          | interests, and the profile can change | appropriate keywords and the system is
                          | over time                             | user-dependent
Machine learning approach | Learning is not restricted to the     | The problem of cold start (no
                          | keywords already entered by the user  | recommendation when the user profile
                          |                                       | is empty)

Table 2 summarizes the advantages and disadvantages of the different approaches to acquiring the
user's profile. We also observe that the learning approach and the dynamic approach are
complementary, in the sense that the learning approach solves the user dependence found in the
dynamic approach, and the dynamic approach solves the cold start problem that exists in the learning
approach.
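
This complementarity can be sketched in a few lines of Python. The profile is seeded with explicit keywords, so it is never empty, and is then adjusted from consulted documents without further effort from the user. The function name, the learning rate and the moving-average update are assumptions; the update is a deliberately simple stand-in for the learning techniques mentioned in Section 3.3.

```python
def hybrid_profile(seed_keywords, observed_docs, learning_rate=0.2):
    """Seed the profile from keywords, then learn from consulted documents.

    observed_docs is a list of (term_weights, interest) pairs, where interest
    is an implicit score in [0, 1] derived from the user's behaviour.
    """
    # Dynamic step: explicit keywords avoid an empty profile (cold start).
    profile = {term: 1.0 for term in seed_keywords}
    # Learning step: each consulted document shifts the weights of its terms
    # towards the document's weights, in proportion to the observed interest.
    for doc_terms, interest in observed_docs:
        for term, weight in doc_terms.items():
            old = profile.get(term, 0.0)
            profile[term] = old + learning_rate * interest * (weight - old)
    return profile


cold = hybrid_profile(["football"], [])
warm = hybrid_profile(
    ["football"],
    [({"tennis": 1.0}, 1.0), ({"tennis": 1.0}, 0.5)],
)
```

With no observations, the profile still contains the seed keyword, so recommendations remain possible. Once documents have been consulted, new terms such as "tennis" enter the profile although the user never typed them.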

In short, the best approaches to modelling and acquiring the user's profile are, respectively, the
multidimensional approach and a hybrid of the dynamic and learning approaches.
The representation and acquisition of the user's profile are part of the recommendation process.